Abstract. Fix a base $B > 1$ and let $\zeta$ have the standard exponential distribution; the distribution of digits
of $\zeta$ base $B$ is known to be very close to Benford's Law. If there exists a $C$ such that the distribution of digits
of $C$ times the elements of some set is the same as that of $\zeta$, we say that set exhibits shifted exponential
behavior base $B$ (with a shift of $\log_B C \bmod 1$). Let $X_1, \dots, X_N$ be independent identically distributed
random variables. If the $X_i$'s are drawn from the uniform distribution on $[0, L]$, then as $N \to \infty$ the
distribution of the digits of the differences between adjacent order statistics converges to shifted exponential
behavior (with a shift of $\log_B L/N \bmod 1$). By differentiating the cumulative distribution function of the
logarithms modulo 1, applying Poisson Summation and then integrating the resulting expression, we derive
rapidly converging explicit formulas measuring the deviations from Benford's Law. Fix a $\delta \in (0, 1)$ and choose
$N$ independent random variables from any compactly supported distribution with uniformly bounded first
and second derivatives and a second order Taylor series expansion at each point. The distribution of digits
of any $N^\delta$ consecutive differences and all $N - 1$ normalized differences of the order statistics exhibit shifted
exponential behavior. We derive conditions on the probability density which determine whether or not the
distribution of the digits of all the un-normalized differences converges to Benford's Law, shifted exponential
behavior, or oscillates between the two, and show that the Pareto distribution leads to oscillating behavior.



1. Introduction 

Benford's Law gives the expected frequencies of the digits in many tabulated data. It was first observed
by Newcomb in the 1880s, who noticed that pages of numbers starting with a 1 in logarithm tables were
significantly more worn than those starting with a 9. In 1938 Benford [Ben] observed the same digit bias in
a variety of phenomena. From his observations he postulated that in many data sets more numbers began
with a 1 than with a 9; his investigations (with 20,229 observations) supported his belief. See [Hi1, Rai] for
a description and history and [Hu] for an extensive bibliography.

For any base $B > 1$ we may uniquely write a positive $x \in \mathbb{R}$ as $x = M_B(x) \cdot B^k$, where $k \in \mathbb{Z}$ and $M_B(x)$
(called the mantissa) is in $[1, B)$. A sequence of positive numbers $\{a_n\}$ is Benford base $B$ if the probability
of observing a mantissa of $a_n$ base $B$ of at most $s$ is $\log_B s$. More precisely, for $s \in [1, B]$ we have

$$\lim_{N \to \infty} \frac{\#\{n \le N : 1 \le M_B(a_n) \le s\}}{N} = \log_B s. \tag{1.1}$$

Benford behavior for continuous functions is defined analogously.^1 Thus base 10 the probability of observing
a first digit of $d$ is $\log_{10}(d + 1) - \log_{10}(d)$, implying that about 30% of the time the first digit is a 1.
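
The mantissa and the first-digit probabilities above can be sketched in a few lines. The function names `mantissa` and `benford_first_digit_probability` are illustrative choices, not notation from the paper, and the endpoint guard is only a floating-point precaution.

```python
import math


def mantissa(x: float, base: float = 10.0) -> float:
    """Return M_B(x) in [1, B), where x = M_B(x) * B**k for some integer k."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    k = math.floor(math.log(x) / math.log(base))
    m = x / base ** k
    # Guard against floating-point rounding at the endpoints of [1, B).
    if m >= base:
        m /= base
    elif m < 1.0:
        m *= base
    return m


def benford_first_digit_probability(d: int, base: int = 10) -> float:
    """Probability log_B(d + 1) - log_B(d) that the first digit equals d."""
    return math.log(d + 1, base) - math.log(d, base)


# First-digit probabilities base 10; the entry for d = 1 is log10(2), about 0.301.
benford_table = {d: benford_first_digit_probability(d) for d in range(1, 10)}
```

The nine probabilities telescope to $\log_{10} 10 = 1$, which gives a quick sanity check on the table.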

We can prove many mathematical systems follow Benford's Law, ranging from recurrence relations [BrDu]
to $n!$ [Dia] to iterates of power, exponential and rational maps and Newton's method [Hi2, BBH, BH] to chains
of random variables and hierarchical Bayesian models [JKKKM] to values of $L$-functions near the critical
line to characteristic polynomials of random matrix ensembles and iterates of the $3x + 1$-Map [KonMi, LS]
to products of random variables [MN]; we also see Benford's Law in a variety of natural systems, such as
atomic physics [Pa], biology [CLTF] and geology [NM1]. Applications of Benford's Law range from rounding
errors in computer calculations (see page 255 of [Knu]) to detecting tax (see [Nig1, Nig2]) and voter fraud
(see [Me]).

^1 If the functions are not positive, we study the distribution of the digits of the absolute value of the function.

This work is motivated by two observations (see Remark 1.9 for more details). First, since Benford's seminal
paper, many investigations have shown that amalgamating data from different sources leads to Benford
behavior; second, many standard probability distributions are close to Benford behavior. We investigate
the distribution of digits of differences of adjacent ordered random variables. For any $\delta < 1$, if we study at
most $N^\delta$ consecutive differences of a data set of size $N$, the resulting distribution of leading digits depends
very weakly on the underlying distribution of the data, and closely approximates Benford's Law. We then
investigate whether or not studying all the differences leads to Benford behavior; this question is inspired by
the first observation above, and has led to new tests for data integrity (see [NM2]). These tests are quick
and easy to apply, and have successfully detected problems with some data sets, thus providing a practical
application of our main results.

To prove our results requires analyzing the distribution of digits of independent random variables drawn
from the standard exponential, and quantifying how close the distribution of digits of a random variable with
the standard exponential distribution is to Benford's Law. Leemis, Schmeiser and Evans [LSE] have observed
that the standard exponential is quite close to Benford's Law; this was proved by Engel and Leuenberger
[EL], who showed that the maximum difference in the cumulative distribution function from Benford's Law
(base 10) is at least .029 and at most .03. We provide an alternate proof of this result in the appendix
using a different technique, as well as showing that there is no base $B$ such that the standard exponential
distribution is Benford base $B$ (Corollary A.2).
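
As a rough numerical illustration of how close the standard exponential is to Benford's Law, the first-digit probabilities can be computed directly from the exponential distribution. This is a minimal sketch, not the method of [EL] or of the appendix: the function names are hypothetical, and the sum over decades is truncated to $10^{-20}, \dots, 10^{3}$ on the assumption that the omitted terms are negligible.

```python
import math


def exponential_first_digit_probability(d: int, k_min: int = -20, k_max: int = 3) -> float:
    """P(first base-10 digit of zeta is d) for zeta standard exponential.

    The first digit is d exactly when zeta lies in [d * 10**k, (d + 1) * 10**k)
    for some integer k, and each such interval has probability
    exp(-d * 10**k) - exp(-(d + 1) * 10**k).
    """
    total = 0.0
    for k in range(k_min, k_max + 1):
        scale = 10.0 ** k
        total += math.exp(-d * scale) - math.exp(-(d + 1) * scale)
    return total


def benford_first_digit_probability(d: int) -> float:
    """Benford probability of first digit d, base 10."""
    return math.log10(1.0 + 1.0 / d)


# Pairs (exponential, Benford) for each first digit d = 1, ..., 9.
comparison = {
    d: (exponential_first_digit_probability(d), benford_first_digit_probability(d))
    for d in range(1, 10)
}
```

Each entry of `comparison` puts the exponential probability next to the Benford value, so the size of the discrepancy for every digit can be read off directly.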

Both proofs apply Fourier analysis to periodic functions. In [EL] the main step (their equation (5)) is
interchanging an integration and a limit. Our proof is based on applying Poisson Summation to the derivative
of the cumulative distribution function of the logarithms modulo 1, $F_B$. Benford's Law is equivalent to
$F_B(b) = b$, which by calculus is the same as $F_B'(b) = 1$ and $F_B(0) = 0$. Thus studying the deviation of
$F_B'(b)$ from 1 is a natural way to investigate the deviations from Benford behavior. We hope the details of
these calculations may be of use to others in investigating related problems (Poisson Summation has been
fruitfully used by Kontorovich-Miller [KonMi] and Jang-Kang-Kruckman-Kudo-Miller [JKKKM] in proving
many systems are Benford; see also [Pin]).
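
The deviation of $F_B'(b)$ from 1 can be explored numerically before any Fourier analysis. The sketch below evaluates the density of $\log_B \zeta \bmod 1$ by summing the density of $\log_B \zeta$ over integer translates, truncated to an assumed range of translates that is adequate for bases between $e$ and 10. The helper `gamma_modulus` uses the standard identity $|\Gamma(1+ix)|^2 = \pi x/\sinh(\pi x)$ to compute the size of the Fourier coefficients that appear in Theorem 1.1. All function names are illustrative.

```python
import math


def log_density_mod_one(b: float, base: float = 10.0, k_min: int = -40, k_max: int = 5) -> float:
    """Evaluate F_B'(b), the density of log_B(zeta) mod 1 at b, by direct summation.

    log_B(zeta) has density exp(-B**y) * B**y * log(B) at y; wrapping modulo 1
    sums this over y = b + k for integers k (truncated here to k_min..k_max).
    """
    log_base = math.log(base)
    total = 0.0
    for k in range(k_min, k_max + 1):
        u = base ** (b + k)
        total += math.exp(-u) * u * log_base
    return total


def max_density_deviation(base: float = 10.0, grid: int = 1000) -> float:
    """Largest |F_B'(b) - 1| over an evenly spaced grid of b in [0, 1)."""
    return max(abs(log_density_mod_one(j / grid, base) - 1.0) for j in range(grid))


def gamma_modulus(m: int, base: float) -> float:
    """|Gamma(1 + 2*pi*i*m / log B)|, from |Gamma(1 + ix)|**2 = pi*x / sinh(pi*x)."""
    x = 2.0 * math.pi * m / math.log(base)
    return math.sqrt(math.pi * x / math.sinh(math.pi * x))
```

Since the $m$-th Fourier mode of $F_B'$ has amplitude $2|\Gamma(1 + 2\pi i m/\log B)|$, `max_density_deviation(10.0)` should be close to `2 * gamma_modulus(1, 10.0)`, with a discrepancy of the order of the second mode.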



1.1. Definitions. A sequence $\{a_n\}_{n=1}^\infty \subset [0,1]$ is equidistributed if

$$\lim_{N \to \infty} \frac{\#\{n : n \le N,\ a_n \in [a,b]\}}{N} = b - a \tag{1.2}$$

for all $[a, b] \subset [0, 1]$. Similarly a continuous random variable on $[0, \infty)$ whose probability density function is
$p$ is equidistributed modulo 1 if

$$\lim_{T \to \infty} \int_0^T \chi_{a,b}(x)\, p(x)\, dx = b - a \tag{1.3}$$

for any $[a, b] \subset [0, 1]$, where $\chi_{a,b}(x) = 1$ for $x \bmod 1 \in [a, b]$ and 0 otherwise.

A positive sequence (or values of a function) is Benford base $B$ if and only if its base $B$ logarithms
are equidistributed modulo 1; this equivalence is at the heart of many investigations of Benford's Law; see
[Dia, MT-B] for a proof.
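
The equivalence between Benford behavior and equidistribution of logarithms can be checked on a concrete sequence. The example below uses the powers of 2, which are not discussed in the text but are a standard Benford sequence base 10 because $\log_{10} 2$ is irrational. The function names and the choice of 10,000 terms are assumptions made for illustration.

```python
import math


def log_fractional_parts(ratio: float, n_terms: int, base: float = 10.0) -> list:
    """Return log_B(ratio**n) mod 1 for n = 1, ..., n_terms."""
    step = math.log(ratio) / math.log(base)
    return [(n * step) % 1.0 for n in range(1, n_terms + 1)]


def proportion_in_interval(points: list, a: float, b: float) -> float:
    """Fraction of the points lying in [a, b)."""
    return sum(1 for p in points if a <= p < b) / len(points)


points = log_fractional_parts(2.0, 10_000)
# First digit 1 base 10 corresponds to log10 mod 1 lying in [0, log10 2).
share_leading_one = proportion_in_interval(points, 0.0, math.log10(2.0))
```

Working with `n * step` rather than with `2**n` itself avoids huge integers and makes the connection to (1.2) explicit: the share of points in $[a, b)$ should approach $b - a$.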

We use the following notation for the various error terms:

(1) Let $\mathcal{E}(x)$ denote an error of at most $x$ in absolute value; thus $f(b) = g(b) + \mathcal{E}(x)$ means $|f(b) - g(b)| \le x$.

(2) Big-Oh notation: for $g(x)$ a non-negative function, we say $f(x) = O(g(x))$ if there exist an $x_0$ and
a $C > 0$ such that, for all $x \ge x_0$, $|f(x)| \le C g(x)$.

The following theorem is the starting point for investigating the distribution of digits of order statistics. 
Theorem 1.1. Let $\zeta$ have the standard (unit) exponential distribution:

$$\mathrm{Prob}(\zeta \in [\alpha, \beta]) = \int_\alpha^\beta e^{-t}\, dt, \qquad [\alpha, \beta] \subset [0, \infty). \tag{1.4}$$

For $b \in [0, 1]$, let $F_B(b)$ be the cumulative distribution function of $\log_B \zeta \bmod 1$; thus
$F_B(b) := \mathrm{Prob}(\log_B \zeta \bmod 1 \in [0, b])$. Then for all $M \ge 2$

$$F_B'(b) = 1 + 2 \sum_{m=1}^{M-1} \mathrm{Re}\left( e^{-2\pi i m b}\, \Gamma\left(1 + \frac{2\pi i m}{\log B}\right) \right)
+ \mathcal{E}\left( 4\sqrt{2}\, \pi\, c_1(B)\, e^{-(\pi^2 - c_2(B)) M/\log B} \right), \tag{1.5}$$

where $c_1(B)$, $c_2(B)$ are constants such that for all $m \ge M \ge 2$ we have

$$e^{2\pi^2 m/\log B} - e^{-2\pi^2 m/\log B} \ge e^{2\pi^2 m/\log B} / c_1^2(B),$$
$$m/\log B \le e^{2 c_2(B) m/\log B},$$
$$1 - e^{-(\pi^2 - c_2(B)) M/\log B} \ge 1/\sqrt{2}. \tag{1.6}$$

For $B \in [e, 10]$ we may take $c_1(B) = \sqrt{2}$ and $c_2(B) = 1/5$, which give

$$\mathrm{Prob}(\log \zeta \bmod 1 \in [a, b]) = b - a + \frac{2r}{\pi} \cdot \sin(\pi(b + a) + \theta) \cdot \sin(\pi(b - a)) + \mathcal{E}(6.32 \cdot 10^{-7}), \tag{1.7}$$

with $r \approx 0.000324986$, $\theta \approx 1.32427186$, and

$$\mathrm{Prob}(\log_{10} \zeta \bmod 1 \in [a, b]) = b - a + \frac{2 r_1}{\pi} \sin(\pi(b + a) - \theta_1) \cdot \sin(\pi(b - a))
- \frac{r_2}{\pi} \sin(2\pi(b + a) + \theta_2) \cdot \sin(2\pi(b - a)) + \mathcal{E}(8.5 \cdot 10^{-5}), \tag{1.8}$$

with

$$r_1 \approx 0.0569573, \quad \theta_1 \approx 0.8055888, \quad r_2 \approx 0.0011080, \quad \theta_2 \approx 0.1384410. \tag{1.9}$$

The above theorem was proved in [EL]; we provide an alternate proof in the appendix. As remarked
earlier, our technique consists of applying Poisson Summation to the derivative of the cumulative distribution
function of the logarithms modulo 1; it is then very natural and easy to compare deviations from the resulting
distribution and the uniform distribution (if a data set satisfies Benford's Law, then the distribution of its
logarithms is uniform). Our series expansions are obtained by applying properties of the Gamma function.

Definition 1.2 (Exponential Behavior, Shifted Exponential Behavior). Let $\zeta$ have the standard exponential
distribution, and fix a base $B$. If the distribution of the digits of a set is the same as the distribution of the
digits of $\zeta$, then we say the set exhibits exponential behavior (base $B$). If there is a constant $C > 0$ such
that the distribution of digits of all elements multiplied by $C$ is exponential behavior, then we say the system
exhibits shifted exponential behavior (with shift of $\log_B C \bmod 1$).

We briefly describe the reasons behind this notation. One important property of Benford's Law is that it
is invariant under rescaling; many authors have used this property to characterize Benford behavior. Thus if
a data set is Benford base $B$ and we fix a positive number $C$, so is the data set obtained by multiplying each
element by $C$. This is clear if, instead of looking at the distribution of the digits, we study the distribution
of the base $B$ logarithms modulo 1. Benford's Law is equivalent to the logarithms modulo 1 being uniformly
distributed (see for instance [Dia, MT-B]); the effect of multiplying all entries by a fixed constant simply
translates the uniform distribution modulo 1, which is again the uniform distribution.

The situation is different for exponential behavior. Multiplying all elements by a fixed constant $C$ (where
$C \ne B^k$ for any $k \in \mathbb{Z}$) does not preserve exponential behavior; however, the effect is easy to describe.
Again looking at the logarithms, exponential behavior is equivalent to the base $B$ logarithms modulo 1
having a specific distribution which is almost equal to the uniform distribution (at least if the base $B$ is not
too large). Multiplying by a fixed constant $C \ne B^k$ shifts the logarithm distribution by $\log_B C \bmod 1$.
1.2. Results for Differences of Order Statistics. We consider a simple case first, and show how the
more general case follows. Let $X_1, \dots, X_N$ be independent identically distributed from the uniform
distribution on $[0, L]$. We consider $L$ fixed and study the limit as $N \to \infty$. Let $X_{1:N}, \dots, X_{N:N}$ be the $X_i$'s in
increasing order. The $X_{i:N}$ are called the order statistics, and satisfy $0 \le X_{1:N} \le X_{2:N} \le \cdots \le X_{N:N} \le L$.
We investigate the distribution of the leading digits of the differences between adjacent $X_{i:N}$'s, $X_{i+1:N} - X_{i:N}$.
For convenience we periodically continue the data and set $X_{i+N:N} = X_{i:N} + L$. As we have $N$ differences
in an interval of size $L$, on average $X_{i+1:N} - X_{i:N}$ is of size $L/N$, and it is sometimes easier to study the
normalized differences

$$Z_{i;N} = \frac{X_{i+1:N} - X_{i:N}}{L/N}. \tag{1.10}$$

As the $X_i$'s are drawn from a uniform distribution, it is a standard result that as $N \to \infty$ the $Z_{i;N}$'s are
independent random variables, each having the standard exponential distribution. Thus as $N \to \infty$ the
probability that $Z_{i;N} \in [a, b]$ tends to $\int_a^b e^{-t}\, dt$. See [DN, Rei] for proofs.

For uniformly distributed random variables, if we know the distribution of $\log_B Z_{i;N} \bmod 1$ then we can
immediately determine the distribution of the digits of the $X_{i+1:N} - X_{i:N}$ base $B$ because

$$\log_B Z_{i;N} = \log_B\left( \frac{X_{i+1:N} - X_{i:N}}{L/N} \right) = \log_B(X_{i+1:N} - X_{i:N}) - \log_B(L/N). \tag{1.11}$$

As the $Z_{i;N}$ are independent with the standard exponential distribution as $N \to \infty$ if the $X_i$ are
independent uniformly distributed, the behavior of the digits of the differences $X_{i+1:N} - X_{i:N}$ is an immediate
consequence of Theorem 1.1.

Theorem 1.3 (Shifted Exponential Behavior of Differences of Independent Uniformly Distributed Random
Variables). Let $X_1, \dots, X_N$ be independently distributed from the uniform distribution on $[0, L]$, and let
$X_{1:N}, \dots, X_{N:N}$ be the $X_i$'s in increasing order. As $N \to \infty$ the distribution of the digits (base $B$) of the
differences $X_{i+1:N} - X_{i:N}$ converges to shifted exponential behavior, with a shift of $\log_B(L/N) \bmod 1$.
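
Theorem 1.3 lends itself to a quick simulation. The sketch below draws $N$ uniform points on $[0, L]$, forms the normalized spacings of (1.10), and tabulates their first digits base 10; these should be close to the first-digit law of the standard exponential. The sample size, the value of $L$, the seed and all function names are assumptions chosen for illustration, and the wrap-around spacing of the periodic continuation is omitted for simplicity.

```python
import math
import random


def first_digit(x: float) -> int:
    """Leading base-10 digit of a positive number."""
    k = math.floor(math.log10(x))
    m = x / 10.0 ** k
    # Guard against rounding that pushes the mantissa just outside [1, 10).
    if m >= 10.0:
        m /= 10.0
    elif m < 1.0:
        m *= 10.0
    return int(m)


def exponential_digit_law(d: int) -> float:
    """P(first digit of a standard exponential is d), summed over decades 10**-20..10**3."""
    return sum(
        math.exp(-d * 10.0 ** k) - math.exp(-(d + 1) * 10.0 ** k)
        for k in range(-20, 4)
    )


def spacing_digit_frequencies(n: int, length: float, seed: int) -> list:
    """First-digit frequencies of the normalized spacings Z = gap / (length / n)."""
    rng = random.Random(seed)
    xs = sorted(rng.uniform(0.0, length) for _ in range(n))
    counts = [0] * 10
    for left, right in zip(xs, xs[1:]):
        z = (right - left) * n / length
        if z > 0.0:  # skip exact ties, which have probability zero
            counts[first_digit(z)] += 1
    total = sum(counts)
    return [c / total for c in counts]


frequencies = spacing_digit_frequencies(n=100_000, length=7.0, seed=2024)
```

Dropping the factor `n / length` gives the un-normalized differences, whose digits follow the same law shifted by $\log_{10}(L/N) \bmod 1$.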

A similar result holds for other distributions. 

Theorem 1.4 (Shifted Exponential Behavior of Subsets of Differences of Independent Random Variables).
Let $X_1, \dots, X_N$ be independent, identically distributed random variables whose density $f(x)$ has a second
order Taylor series at each point with first and second derivatives uniformly bounded, and let the $X_{i:N}$'s be
the $X_i$'s in increasing order. Fix a $\delta \in (0, 1)$. Then as $N \to \infty$ the distribution of the digits (base $B$) of
$N^\delta$ consecutive differences $X_{i+1:N} - X_{i:N}$ converges to shifted exponential behavior, provided the $X_{i:N}$'s are
from a region where $f(x)$ is non-zero.

The key ingredient in this generalization is that the techniques which show that the differences between 
uniformly distributed random variables become independent exponentially distributed random variables can 
be modified to handle more general distributions. 
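
A small experiment illustrates the generalization. As an assumed test case (not taken from the paper), the sample below is standard exponential, so $f(x) = e^{-x}$; the sketch takes roughly $N^\delta$ consecutive order statistics starting at the median and rescales their gaps by the single factor $N f(a_N)$, with $a_N$ the first point of the window. The rescaled gaps should then look like standard exponentials. Parameter values and names are illustrative only.

```python
import math
import random


def first_digit(x: float) -> int:
    """Leading base-10 digit of a positive number."""
    k = math.floor(math.log10(x))
    m = x / 10.0 ** k
    if m >= 10.0:
        m /= 10.0
    elif m < 1.0:
        m *= 10.0
    return int(m)


def window_normalized_gaps(n: int, delta: float, seed: int) -> list:
    """Gaps between about N**delta consecutive order statistics, scaled by N * f(a_N).

    The sample is standard exponential, so the density is f(x) = exp(-x);
    a_N is taken to be the first order statistic in the window.
    """
    rng = random.Random(seed)
    xs = sorted(rng.expovariate(1.0) for _ in range(n))
    width = int(n ** delta)
    start = n // 2  # begin the window near the median, where f is not small
    window = xs[start:start + width + 1]
    scale = n * math.exp(-window[0])
    return [(right - left) * scale for left, right in zip(window, window[1:])]


gaps = window_normalized_gaps(n=200_000, delta=0.6, seed=7)
mean_gap = sum(gaps) / len(gaps)
share_leading_one = sum(1 for g in gaps if g > 0.0 and first_digit(g) == 1) / len(gaps)
```

Using one scale factor for the whole window mirrors the statement that the normalization is essentially constant across $N^\delta$ consecutive differences; with only about 1,500 gaps the agreement is necessarily rough.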

We restricted ourselves to a subset of all consecutive spacings because the normalization factor changes
throughout the domain. The shift in the shifted exponential behavior depends on which set of $N^\delta$ differences
we study, coming from the variations in the normalizing factors. Within a bin of $N^\delta$ differences the
normalization factor is basically constant, and we may approximate our density with a uniform distribution.
It is possible for these variations to cancel and yield Benford behavior for the digits of all the un-normalized
differences. Such a result is consistent with the belief that amalgamation of data from many different
distributions becomes Benford; however, this is not always the case (see Remark 1.6). From Theorem 1.1 and
Theorem 1.4 we obtain

Theorem 1.5 (Benford Behavior for all the Differences of Independent Random Variables). Let $X_1, \dots, X_N$
be independent, identically distributed random variables whose density $f(x)$ is compactly supported and has a
second order Taylor series at each point with first and second derivatives uniformly bounded. Let the $X_{i:N}$'s
be the $X_i$'s in increasing order, $F(x)$ be the cumulative distribution function for $f(x)$, and fix a $\delta \in (0, 1)$.
Let $I(\epsilon, \delta, N) = [\epsilon N^{1-\delta}, N^{1-\delta} - \epsilon N^{1-\delta}]$. For each fixed $\epsilon \in (0, 1/2)$, assume that

• $f(F^{-1}(k N^{\delta-1}))$ is not too small for $k \in I(\epsilon, \delta, N)$:

$$\lim_{N \to \infty} \max_{k \in I(\epsilon,\delta,N)} \frac{\min(N^{-(\epsilon+\delta/2)}, N^{\delta-1})}{f(F^{-1}(k N^{\delta-1}))} = 0; \tag{1.12}$$

• $\log_B f(F^{-1}(k N^{\delta-1})) \bmod 1$ is equidistributed: for all $[\alpha, \beta] \subset [0, 1]$

$$\lim_{N \to \infty} \frac{\#\{k \in I(\epsilon,\delta,N) : \log_B f(F^{-1}(k N^{\delta-1})) \bmod 1 \in [\alpha, \beta]\}}{\#\{k \in I(\epsilon,\delta,N)\}} = \beta - \alpha. \tag{1.13}$$

Then if $\epsilon > \max(0, 1/3 - \delta/2)$ and $\epsilon < \delta/2$, the distribution of the digits of the $N - 1$ differences $X_{i+1:N} - X_{i:N}$
converges to Benford's Law (base $B$) as $N \to \infty$.

Remark 1.6. The conditions of Theorem 1.5 are usually not satisfied. We are unaware of any situation
where (1.13) holds; we have included Theorem 1.5 to give a sufficient condition of what is required to have
Benford's Law satisfied exactly, and not just approximately. In Lemma 3.3 we show the conditions fail for
the Pareto distribution, and the limiting behavior oscillates between Benford and a sum of shifted exponential
behavior.^2 The arguments generalize to many densities whose cumulative distribution functions have tractable
closed-form expressions (for example, exponential, Weibull, or $f(x) = e^{-e^x} e^x$).

The situation is very different if instead we study normalized differences

$$\widetilde{Z}_{i;N} = \frac{X_{i+1:N} - X_{i:N}}{1/(N f(X_{i:N}))}; \tag{1.14}$$

note if $f(x) = 1/L$ is the uniform distribution on $[0, L]$, (1.14) reduces to (1.10).

Theorem 1.7 (Shifted Exponential Behavior for All the Normalized Differences of Independent Random
Variables). Assume the probability distribution $f$ satisfies the conditions of Theorem 1.5 and (1.12) and $\widetilde{Z}_{i;N}$
is as in (1.14). Then as $N \to \infty$ the distribution of the digits of the $\widetilde{Z}_{i;N}$ converges to shifted exponential
behavior.

Remark 1.8. Appropriately scaled, the distribution of the digits of the differences is universal, and is the
exponential behavior of Theorem 1.1. Thus Theorem 1.7 implies that the natural quantity to study is the
normalized differences of the order statistics, not the differences. See also Remark 3.5. With additional work
we could study densities with unbounded support and show that, through truncation, we can get arbitrarily
close to shifted exponential behavior.

Remark 1.9. The main motivation for this work is the need for improved ways of assessing the authenticity
and integrity of scientific and corporate data. Benford's Law has been successfully applied to detecting income
tax, corporate and voter fraud (see [Me, Nig1, Nig2]); in [NM2] we use these results to derive new statistical
tests to examine data authenticity and integrity. Early applications of these tests to financial data showed that
they could detect errors in data downloads, rounded data, and inaccurate ordering of data. These attributes are
not easily observable from an analysis of descriptive statistics, and detecting these errors can help managers
avoid costly decisions based on erroneous data.

The paper is organized as follows. We prove Theorem 1.1 in Appendix A by using Poisson summation to
analyze $F_B'(b)$. Theorem 1.3 follows from results for the order statistics of independent uniform variables;
the proof of Theorem 1.4 is similar, and given in §2. In §3 we prove Theorems 1.5 and 1.7.

^2 If several data sets each exhibit shifted exponential behavior but with distinct shifts, then the amalgamated data set is closer
to Benford's Law than any of the original data sets. This is apparent by studying the logarithms modulo 1. The differences
between these densities and Benford's Law will look like the plot on the right in Figure 1 (except, of course, that different shifts
will result in shifting the plot modulo 1). The key observation is that the unequal shifts mean we do not have reinforcements
from the peaks of the modulo 1 densities being aligned, and thus the amalgamation will decrease the maximum deviations.

2. Proofs of Theorems 1.3 and 1.4

Theorem 1.3 is a consequence of the fact that the normalized differences between the order statistics drawn
from the uniform distribution converge to being independent standard exponentials. The proof of Theorem
1.4 proceeds similarly. Specifically, over a short enough region any distribution with a second order Taylor
series at each point with first and second derivatives uniformly bounded is well-approximated by a uniform
distribution.

To prove Theorem 1.4 it suffices to show that if $X_1, \dots, X_N$ are drawn from a sufficiently nice distribution,
then for any fixed $\delta \in (0, 1)$ the limiting behavior of the order statistics of $N^\delta$ adjacent $X_i$'s becomes
Poissonian (i.e., the $N^\delta - 1$ normalized differences converge to being independently distributed from the
standard exponential). We prove this below for compactly supported distributions^3 $f(x)$ that have a second
order Taylor series at each point with the first and second derivatives uniformly bounded, and when the $N^\delta$
adjacent $X_i$'s are from a region where $f(x)$ is bounded away from zero.

For each $N$, consider intervals $[a_N, b_N]$ such that $\int_{a_N}^{b_N} f(x)\, dx = N^\delta/N$; thus the proportion of the total
mass in such intervals is $N^{\delta-1}$. We fix such an interval for our arguments. For each $i \in \{1, \dots, N\}$ let

$$w_i = \begin{cases} 1 & \text{if } X_i \in [a_N, b_N] \\ 0 & \text{otherwise.} \end{cases} \tag{2.1}$$

Note $w_i$ is 1 with probability $N^{\delta-1}$ and 0 with probability $1 - N^{\delta-1}$; $w_i$ is a binary indicator random variable,
telling us whether or not $X_i \in [a_N, b_N]$. Thus

$$\mathbb{E}\left[ \sum_{i=1}^N w_i \right] = N^\delta, \qquad \mathrm{Var}\left( \sum_{i=1}^N w_i \right) = N^\delta \cdot (1 - N^{\delta-1}). \tag{2.2}$$

Let $M_N$ be the number of $X_i$ in $[a_N, b_N]$, and let $\beta_N$ be any non-decreasing sequence tending to infinity (in
the course of the proof, we will find we may take any sequence with $\beta_N = o(N^{\delta/2})$). By (2.2) and the Central
Limit Theorem (which we may use as the $w_i$'s satisfy the Lyapunov condition), with probability tending to
1 we have

$$M_N = N^\delta + O(\beta_N N^{\delta/2}). \tag{2.3}$$

We assume that in the interval $[a_N, b_N]$ there exist constants $c$ and $C$ such that whenever $x \in [a_N, b_N]$,
$0 < c < f(x) < C < \infty$; we assume these constants hold for all regions investigated and for all $N$. Thus

$$c \cdot (b_N - a_N) \le \int_{a_N}^{b_N} f(x)\, dx = N^{\delta-1} \le C \cdot (b_N - a_N), \tag{2.4}$$

implying that $b_N - a_N$ is of size $N^{\delta-1}$. If we assume $f(x)$ has at least a second order Taylor expansion, then

$$f(x) = f(a_N) + f'(a_N)(x - a_N) + O((x - a_N)^2) = f(a_N) + f'(a_N)(x - a_N) + O(N^{2\delta-2}). \tag{2.5}$$

As we are assuming the first and second derivatives are uniformly bounded, as well as $f$ being bounded away
from zero in the intervals under consideration, all big-Oh constants below are independent of $N$. Thus

$$b_N - a_N = \frac{N^{\delta-1}}{f(a_N)} + O(N^{2\delta-2}). \tag{2.6}$$

We now investigate the order statistics of the $M_N$ of the $X_i$'s that lie in $[a_N, b_N]$. We know $\int_{a_N}^{b_N} f(x)\, dx =
N^{\delta-1}$; by setting $g_N(x) = f(x) N^{1-\delta}$ then $g_N(x)$ is the conditional density function for $X_i$, given that
$X_i \in [a_N, b_N]$. Thus $g_N(x)$ integrates to 1, and for $x \in [a_N, b_N]$ we have

$$g_N(x) = f(a_N) \cdot N^{1-\delta} + f'(a_N)(x - a_N) \cdot N^{1-\delta} + O(N^{\delta-1}). \tag{2.7}$$

We have an interval of size $N^{\delta-1}/f(a_N) + O(N^{2\delta-2})$, and $M_N = N^\delta + O(\beta_N N^{\delta/2})$ of the $X_i$ lying in
the interval (remember the $\beta_N$ are any non-decreasing sequence tending to infinity). Thus with probability
tending to 1, the average spacing between adjacent ordered $X_i$ is

$$\frac{N^{\delta-1}/f(a_N) + O(N^{2\delta-2})}{M_N} = (f(a_N) N)^{-1} + N^{-1} \cdot O(\beta_N N^{-\delta/2} + N^{\delta-1}); \tag{2.8}$$

in particular, we see we must choose $\beta_N = o(N^{\delta/2})$. As $\delta \in (0, 1)$, if we fix a $k$ such that $X_k \in [a_N, b_N]$ then
we expect the next $X_i$ to the right of $X_k$ to be about $\frac{t}{N f(a_N)}$ units away, where $t$ is of size 1. For a given
$X_k$ we can compute the conditional probability that the next $X_i$ is between $\frac{t}{N f(a_N)}$ and $\frac{t + \Delta t}{N f(a_N)}$ units to the
right: it is simply the difference of the probability that all the other $M_N - 1$ of the $X_i$'s in $[a_N, b_N]$ are not
in the interval $[X_k, X_k + \frac{t}{N f(a_N)}]$ and the probability that all other $X_i$ in $[a_N, b_N]$ are not in the interval
$[X_k, X_k + \frac{t + \Delta t}{N f(a_N)}]$; note we are using the wrapped interval $[a_N, b_N]$.

^3 If our distribution has unbounded support, for any $\epsilon > 0$ we can truncate it on both sides so that the omitted probability
is at most $\epsilon$. Our result is then trivially modified to being within $\epsilon$ of shifted exponential behavior.

Some care is required in these calculations. We have a conditional probability as we are assuming both
$X_k \in [a_N, b_N]$ and that exactly $M_N$ of the $X_i$ are in $[a_N, b_N]$. Thus these probabilities depend on two
random variables, namely $X_k$ and $M_N$. This is not a problem in practice, however (for example, $M_N$ is
tightly concentrated about its mean value).

Recalling our expansion for $g_N(x)$ (and that $b_N - a_N = N^{\delta-1}/f(a_N) + O(N^{2\delta-2})$ and $t$ is of size 1), after
simple algebra we find that, with probability tending to 1, for a given $X_k$ and $M_N$ the first probability is

$$\left( 1 - \int_{X_k}^{X_k + \frac{t}{N f(a_N)}} g_N(x)\, dx \right)^{M_N - 1}. \tag{2.9}$$

The above integral equals $t N^{-\delta} + O(N^{-1})$ (use the Taylor series expansion in (2.7) and note that the interval
$[a_N, b_N]$ is of size $O(N^{\delta-1})$). Using (2.3), it is easy to see that this is a.s. equal to

$$\left( 1 - \frac{t + O(N^{\delta-1} + \beta_N N^{-\delta/2})}{M_N} \right)^{M_N - 1}. \tag{2.10}$$

We therefore find that as $N \to \infty$ the probability that $M_N - 1$ of the $X_i$'s ($i \ne k$) are in
$[a_N, b_N] \setminus [X_k, X_k + t/(N f(a_N))]$, conditioned on $X_k$ and $M_N$, converges to $e^{-t}$.^4

The calculation of the second probability, the conditional probability that the $M_N - 1$ other $X_i$'s in
$[a_N, b_N]$ are not in the interval $[X_k, X_k + \frac{t + \Delta t}{N f(a_N)}]$, given $X_k$ and $M_N$, follows analogously by replacing $t$ with
$t + \Delta t$ in the previous argument. We thus find that this probability is $e^{-(t + \Delta t)}$. As

$$\int_t^{t + \Delta t} e^{-u}\, du = e^{-t} - e^{-(t + \Delta t)}, \tag{2.11}$$

we find that the density of the difference between adjacent order statistics tends to the standard (unit)
exponential density; thus the proof of Theorem 1.4 now follows from Theorem 1.3.
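
The limit at the heart of this argument is easy to check numerically. The sketch below drops the error terms in (2.10), which is a simplifying assumption, and evaluates $(1 - t/M)^{M-1}$ for growing $M$, together with the difference of two such probabilities as in (2.11). The function names are illustrative.

```python
import math


def avoid_probability(t: float, m: int) -> float:
    """(1 - t/m)**(m - 1): the m - 1 other points all miss a window of mass t/m."""
    return (1.0 - t / m) ** (m - 1)


def gap_probability(t: float, dt: float, m: int) -> float:
    """Probability that the next point lies between t and t + dt rescaled units away."""
    return avoid_probability(t, m) - avoid_probability(t + dt, m)


# Distance from the limiting value exp(-1) as the number of points grows.
errors = [abs(avoid_probability(1.0, m) - math.exp(-1.0)) for m in (10, 1_000, 100_000)]
```

The entries of `errors` shrink as $M$ grows, matching the convergence of the conditional probability to $e^{-t}$, and `gap_probability` approaches $e^{-t} - e^{-(t + \Delta t)}$.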

3. Proofs of Theorems 1.5 and 1.7

We generalize the notation from §2. Let $f(x)$ be any distribution with a second order Taylor series at each
point with first and second derivatives uniformly bounded, and let $X_{1:N}, \dots, X_{N:N}$ be the order statistics.
We fix a $\delta \in (0, 1)$, and for $k \in \{1, \dots, N^{1-\delta}\}$ we consider bins $[a_{k;N}, b_{k;N}]$ such that

$$\int_{a_{k;N}}^{b_{k;N}} f(x)\, dx = N^\delta/N = N^{\delta-1}; \tag{3.1}$$

there are $N^{1-\delta}$ such bins. By the Central Limit Theorem (see (2.3)), if $M_{k;N}$ is the number of order statistics
in $[a_{k;N}, b_{k;N}]$ then provided that $\epsilon > \max(0, 1/3 - \delta/2)$ with probability tending to 1 we have

$$M_{k;N} = N^\delta + O(N^{\epsilon + \delta/2}); \tag{3.2}$$

of course, we also require $\epsilon < \delta/2$, as otherwise the error term is larger than the main term.

Remark 3.1. Before we considered just one fixed interval; as we are studying $N^{1-\delta}$ intervals simultaneously,
we need the $\epsilon$ in the exponent so that with high probability all intervals have to first order $N^\delta$ order statistics.
For the arguments below, it would have sufficed to have an error of size $O(N^{\delta-\epsilon})$. We thank the referee for
pointing out that $\epsilon > 1/3 - \delta/2$, and provide his argument in Appendix B.

Similar to (2.8), the average spacing between adjacent order statistics in $[a_{k;N}, b_{k;N}]$ is

$$(f(a_{k;N}) N)^{-1} + N^{-1} \cdot O(N^{-(\epsilon + \delta/2)} + N^{\delta-1}). \tag{3.3}$$

Note (3.3) is the generalization of (1.10); if $f$ is the uniform distribution on $[0, L]$ then $f(a_{k;N}) = 1/L$.
By Theorem 1.4, as $N \to \infty$ the distribution of digits of the differences in each bin converges to shifted
exponential behavior; however, the variation in the average spacing between bins leads to bin-dependent
shifts in the shifted exponential behavior.



^4 Some care is required, as the exceptional set in our a.s. statement can depend on $t$. This can be surmounted by taking
expectations with respect to our conditional probabilities and applying the dominated convergence theorem.

Similar to (1.11), we can study the distribution of digits of the differences of the normalized order statistics.
If $X_{i:N}$ and $X_{i+1:N}$ are in $[a_{k;N}, b_{k;N}]$ then

$$Z_{i;N} = (X_{i+1:N} - X_{i:N}) \Big/ \left( (f(a_{k;N}) N)^{-1} + N^{-1} \cdot O(N^{-(\epsilon + \delta/2)} + N^{\delta-1}) \right),$$
$$\log_B Z_{i;N} = \log_B(X_{i+1:N} - X_{i:N}) + \log_B N - \log_B\left( f(a_{k;N})^{-1} + O(N^{-(\epsilon + \delta/2)} + N^{\delta-1}) \right). \tag{3.4}$$

Note we are using the same normalization factor for all differences between adjacent order statistics in a bin.
Later we show we may replace $f(a_{k;N})$ with $f(X_{i:N})$. As we study all $X_{i+1:N} - X_{i:N}$ in the bin $[a_{k;N}, b_{k;N}]$,
it is useful to rewrite the above as

$$\log_B(X_{i+1:N} - X_{i:N}) = \log_B Z_{i;N} - \log_B N + \log_B\left( f(a_{k;N})^{-1} + O(N^{-(\epsilon + \delta/2)} + N^{\delta-1}) \right). \tag{3.5}$$

We have $N^{1-\delta}$ bins, so $k \in \{1, \dots, N^{1-\delta}\}$. As we only care about the limiting behavior, we may safely ignore
the first and last bins. We may therefore assume each $a_{k;N}$ is finite, and $a_{k+1;N} = b_{k;N}$.^5
Let $F(x)$ be the cumulative distribution function for $f(x)$. Then

$$F(a_{k;N}) = (k - 1) N^\delta/N = (k - 1) N^{\delta-1}. \tag{3.6}$$

For notational convenience we relabel the bins so that $k \in \{0, \dots, N^{1-\delta} - 1\}$; thus $F(a_{k;N}) = k N^{\delta-1}$.

We now prove our theorems which determine when these bin-dependent shifts cancel (yielding Benford
behavior), or reinforce (yielding sums of shifted exponential behavior).

Proof of Theorem 1.5. There are approximately $N^\delta$ differences in each bin $[a_{k;N}, b_{k;N}]$. By Theorem 1.4,
the distribution of the digits of the differences in each bin converges to shifted exponential behavior. As
we assume the first and second derivatives of $f$ are uniformly bounded, the big-Oh constants in §2 are
independent of the bins. The shift in the shifted exponential behavior in each bin is controlled by the last
two terms on the right hand side of (3.5). The $\log_B N$ shifts the shifted exponential behavior in each bin
equally. The bin-dependent shift is controlled by the final term,

$$\log_B\left( f(a_{k;N})^{-1} + O(N^{-(\epsilon + \delta/2)} + N^{\delta-1}) \right)
= -\log_B f(a_{k;N}) + \log_B\left( 1 + O\left( \frac{\min(N^{-(\epsilon + \delta/2)}, N^{\delta-1})}{f(a_{k;N})} \right) \right). \tag{3.7}$$

Thus each of the $N^{1-\delta}$ bins exhibits shifted exponential behavior, with a bin-dependent shift composed
of the two terms in (3.7). By (1.12), the $f(a_{k;N})$ are not small compared to $\min(N^{-(\epsilon + \delta/2)}, N^{\delta-1})$, and
hence the second term $\log_B\left( 1 + O\left( \min(N^{-(\epsilon + \delta/2)}, N^{\delta-1}) / f(a_{k;N}) \right) \right)$ is negligible. In particular, this factor
depends only very weakly on the bin, and tends to zero as $N \to \infty$.

Thus the bin-dependent shift in the shifted exponential behavior is approximately $-\log_B f(a_{k;N}) =
-\log_B f(F^{-1}(k N^{\delta-1}))$. If these shifts are equidistributed modulo 1, then the deviations from Benford
behavior cancel, and the shifted exponential behavior of each bin becomes Benford behavior for all the
differences. □

Remark 3.2. Consider the case when the density is a uniform distribution on some interval. Then all
$f(F^{-1}(k N^{\delta-1}))$ are equal, and each bin has the same shift in its shifted exponential behavior. These shifts
therefore reinforce each other, and the distribution of all the differences is also shifted exponential behavior,
with the same shift. This is observed in numerical experiments; see Theorem 1.3 for an alternate proof.



We analyze the assumptions of Theorem 1.5. The condition from (1.12) is easy to check, and is often
satisfied. For example, if the probability density is a finite union of monotonic pieces and is zero only
finitely often, then (1.12) holds. This is because for $k \in I(\epsilon, \delta, N)$, $F^{-1}(k N^{\delta-1}) \in [F^{-1}(\epsilon), F^{-1}(1 - \epsilon)]$ and is
therefore independent of $N$ (if $f$ vanishes finitely often, we need to remove small sub-intervals from $I(\epsilon, \delta, N)$,
but the analysis proceeds similarly). The only difficulty is basically a probability distribution with intervals
of zero probability. Thus (1.12) is a mild assumption.

^5 Of course, we know both quantities are finite as we assumed our distribution has compact support. We remove the last
bins to simplify generalizations to non-compactly supported distributions.

If we choose any distribution other than a uniform distribution, then $f(x)$ is not constant; however, (1.13)
need not hold (i.e., $\log_B f(a_{k;N}) \bmod 1$ need not be equidistributed as $N \to \infty$). For example, consider a
Pareto distribution with minimum value 1 and exponent $\alpha > 0$. The density is

$$f(x) = \begin{cases} \alpha x^{-\alpha-1} & \text{if } x \ge 1 \\ 0 & \text{otherwise.} \end{cases} \tag{3.8}$$

The Pareto distribution is known to be useful in modeling natural phenomena, and for appropriate choices
of exponents yields approximately Benford behavior (see [NM1]).

Example 3.3. If f is a Pareto distribution with minimum value 1 and exponent a > 0, then f does not 
satisfy the second condition of Theorem \1.5l equation (|1.13|) . 

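To see this, note that the cumulative distribution function of $f$ is $F(x) = 1 - x^{-a}$. As we only care about the
limiting behavior, we need only study $k \in I(\epsilon,\delta,N) = [\epsilon N^{1-\delta}, N^{1-\delta} - \epsilon N^{1-\delta}]$. Therefore $F(a_{k;N}) = kN^{\delta-1}$
implies that

\[ a_{k;N} = (1 - kN^{\delta-1})^{-1/a}, \qquad f(a_{k;N}) = a(1 - kN^{\delta-1})^{(a+1)/a}. \tag{3.9} \]

The condition from (1.12) is satisfied, namely

\[ \lim_{N\to\infty} \max_{k \in I(\epsilon,\delta,N)} \frac{\min(N^{-(\epsilon+\delta/2)}, N^{\delta-1})}{f(a_{k;N})} = \lim_{N\to\infty} \max_{k \in I(\epsilon,\delta,N)} \frac{\min(N^{-(\epsilon+\delta/2)}, N^{\delta-1})}{a(1 - kN^{\delta-1})^{(a+1)/a}} = 0, \tag{3.10} \]

as $1 - kN^{\delta-1} \ge \epsilon$ for $k \in I(\epsilon,\delta,N)$, so the denominator is bounded away from zero while the numerator
tends to zero.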

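Let $j = N^{1-\delta} - k \in I(\epsilon,\delta,N)$. Then the bin-dependent shifts are

\[ \begin{aligned}
\log_B f(a_{k;N}) &= \frac{a+1}{a} \log_B(1 - kN^{\delta-1}) + \log_B a \\
&= \frac{a+1}{a} \log_B(jN^{\delta-1}) + \log_B a \\
&= \log_B\left(j^{(a+1)/a}\right) + \log_B\left(aN^{-(1-\delta)(a+1)/a}\right).
\end{aligned} \tag{3.11} \]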

Thus, for a Pareto distribution with exponent $a$, the distribution of all the differences becomes Benford if and
only if $j^{(a+1)/a}$ is Benford. This follows from the fact that a sequence is Benford if and only if its logarithms
are equidistributed. For fixed $m$, $j^m$ is not Benford (see for example [Dia]), and thus the condition from
(1.13) fails.
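As a rough numerical illustration of the last claim (not part of the original argument), the sketch below counts how often $j^m$ has leading digit 1 for $j \le J$. It assumes integer powers $m$ so that the arithmetic is exact, whereas the exponent $(a+1)/a$ above is generally not an integer; the function name `leading_digit_fraction` and the cutoffs $J$ are illustrative choices. If $j^m$ were Benford base 10, the fraction would settle near $\log_{10} 2 \approx 0.301$.

```python
from fractions import Fraction


def leading_digit_fraction(J: int, m: int = 1, digit: int = 1) -> Fraction:
    """Fraction of j in 1..J for which j**m has the given leading decimal digit."""
    hits = sum(1 for j in range(1, J + 1) if str(j ** m)[0] == str(digit))
    return Fraction(hits, J)


if __name__ == "__main__":
    # For m = 1 the fraction swings between about 1/9 and 5/9 as J grows,
    # so it never settles at the Benford value log10(2) ~ 0.301.
    for J in (999, 1999, 9999, 19999):
        for m in (1, 2):
            print(J, m, float(leading_digit_fraction(J, m)))
```

The persistent oscillation in $J$ is what non-equidistribution of the logarithms modulo 1 looks like in practice.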

Remark 3.4. We chose to study a Pareto distribution because the distribution of digits of a random variable
drawn from a Pareto distribution converges to Benford behavior (base 10) as $a \to 1$; however, the digits of
the differences do not tend to Benford (or shifted exponential) behavior. A similar analysis holds for many
distributions with good closed-form expressions for the cumulative distribution function. In particular, if $f$ is
the density of an exponential or Weibull distribution (or $f(x) = e^{-e^x} e^x$), then $f$ does not satisfy the second
condition of Theorem 1.5, equation (1.13).

Modifying the proof of Theorem 1.5 yields our result on the distribution of digits of the normalized
differences.

Proof of Theorem 1.7. If $f$ is the uniform distribution, there is nothing to prove. For general $f$, rescaling
the differences eliminates the bin-dependent shifts. Let

\[ \widetilde{Z}_{i:N} = \frac{X_{i+1:N} - X_{i:N}}{1/(N f(X_{i:N}))}. \tag{3.12} \]

In Theorem 1.5 we use the same scale factor for all differences in a bin; see (3.4). As we assume the first and
second derivatives of $f$ are uniformly bounded, (2.5) and (2.6) imply that for $X_{i:N} \in [a_{k;N}, b_{k;N}]$,

\[ \begin{aligned}
f(X_{i:N}) &= f(a_{k;N}) + O\left(b_{k;N} - a_{k;N}\right) \\
&= f(a_{k;N}) + O\left(\frac{N^{\delta-1}}{f(a_{k;N})} + N^{2\delta-2}\right),
\end{aligned} \tag{3.13} \]

and the big-Oh constants are independent of $k$. As we assume $f$ satisfies (1.12), the error term is negligible.

Thus our assumptions on $f$ imply that $f$ is basically constant on each bin, and we may replace the local
rescaling factor $f(X_{i:N})$ with the bin rescaling factor $f(a_{k;N})$. Thus each bin of normalized differences has
the same shift in its shifted exponential behavior. Therefore all the shifts reinforce, and the digits of all the
normalized differences exhibit shifted exponential behavior as $N \to \infty$. □
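The rescaling in the proof can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not the experiment reported in the paper: the sample size, the exponent `a = 3.0`, the seed, and the helper names are all hypothetical choices. It forms the normalized differences $N f(X_{i:N})(X_{i+1:N} - X_{i:N})$ for Pareto samples and compares two summary statistics with those of a standard exponential.

```python
import math
import random


def pareto_density(x: float, a: float) -> float:
    """Density a * x**(-a-1) of the Pareto law with minimum value 1."""
    return a * x ** (-a - 1.0) if x >= 1.0 else 0.0


def normalized_differences(sample, density):
    """Return N * f(X_{i:N}) * (X_{i+1:N} - X_{i:N}) for adjacent order statistics."""
    xs = sorted(sample)
    n = len(xs)
    return [n * density(xs[i]) * (xs[i + 1] - xs[i]) for i in range(n - 1)]


def simulate(n: int = 20000, a: float = 3.0, seed: int = 0):
    """Draw n Pareto variables (illustrative parameters) and normalize the gaps."""
    rng = random.Random(seed)
    sample = [rng.paretovariate(a) for _ in range(n)]
    return normalized_differences(sample, lambda x: pareto_density(x, a))


if __name__ == "__main__":
    z = simulate()
    mean = sum(z) / len(z)
    frac_above_one = sum(1 for v in z if v > 1.0) / len(z)
    # For a standard exponential the mean is 1 and Prob(Z > 1) = exp(-1).
    print(mean, frac_above_one, math.exp(-1.0))
```

Using the density at the left endpoint of each gap mirrors the local rescaling factor $f(X_{i:N})$ in the proof; a bin-level factor would be an equally reasonable choice for a sketch like this.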

As an example of Theorem 1.7, in Figure 1 we consider 500,000 independent random variables drawn from
the Pareto distribution with exponent

\[ a = \frac{4 + \sqrt[3]{19 - 3\sqrt{33}} + \sqrt[3]{19 + 3\sqrt{33}}}{3} \tag{3.14} \]

(we chose $a$ to make the variance equal 1). We study the distribution of the digits of the differences in base
10. The amplitude is about .018, which is the amplitude of the shifted exponential behavior of Theorem 1.1
(see the equation in Theorem 2 of [EL] or (1.5) of Theorem 1.1).
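The choice of exponent can be sanity-checked numerically. The sketch below assumes the standard variance formula $a/((a-1)^2(a-2))$ for a Pareto distribution with minimum value 1 and exponent $a > 2$; setting this equal to 1 gives the cubic $a^3 - 4a^2 + 4a - 2 = 0$, whose real root is the closed form in (3.14). The function names are illustrative.

```python
import math


def pareto_variance(a: float) -> float:
    """Variance of a Pareto law with minimum value 1 and exponent a > 2."""
    return a / ((a - 1.0) ** 2 * (a - 2.0))


def unit_variance_exponent() -> float:
    """Closed form (3.14): the real root of a^3 - 4a^2 + 4a - 2 = 0."""
    s = 3.0 * math.sqrt(33.0)
    return (4.0 + (19.0 - s) ** (1.0 / 3.0) + (19.0 + s) ** (1.0 / 3.0)) / 3.0


if __name__ == "__main__":
    a = unit_variance_exponent()
    # The variance should equal 1 up to floating-point rounding.
    print(a, pareto_variance(a))
```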








FIGURE 1. All 499,999 differences of adjacent order statistics from 500,000 independent 
random variables from the Pareto distribution with minimum value and variance 1. (left) 
Observed digits of scaled differences of adjacent random variables versus Benford's law; 
(right) Scaled observed minus Benford's Law (cumulative distribution of base 10 logarithms). 



Remark 3.5. The universal behavior of Theorem 1.7 suggests that if we are interested in the behavior of the
digits of all the differences, the natural quantity to study is the normalized differences. For any distribution 
with uniformly bounded first and second derivatives and a second order Taylor series expansion at each point, 
we obtain shifted exponential behavior. 

Appendix A. Proof of Theorem 1.1

To prove Theorem 1.1 it suffices to study the distribution of $\log_B \zeta \bmod 1$ when $\zeta$ has the standard
exponential distribution; see (1.4). We have the following useful chain of equalities. Let $[a,b] \subset [0,1]$. Then

\[ \begin{aligned}
\mathrm{Prob}(\log_B \zeta \bmod 1 \in [a,b]) &= \sum_{k=-\infty}^{\infty} \mathrm{Prob}(\log_B \zeta \in [a+k, b+k]) \\
&= \sum_{k=-\infty}^{\infty} \mathrm{Prob}(\zeta \in [B^{a+k}, B^{b+k}]) \\
&= \sum_{k=-\infty}^{\infty} \left(e^{-B^{a+k}} - e^{-B^{b+k}}\right).
\end{aligned} \tag{A.1} \]

It suffices to investigate (A.1) in the special case when $a = 0$, as the probability of any interval $[\alpha, \beta]$ can
always be found by subtracting the probability of $[0, \alpha]$ from $[0, \beta]$. We are therefore led to studying, for
$b \in [0,1]$, the cumulative distribution function of $\log_B \zeta \bmod 1$:

\[ F_B(b) := \mathrm{Prob}(\log_B \zeta \bmod 1 \in [0,b]) = \sum_{k=-\infty}^{\infty} \left(e^{-B^{k}} - e^{-B^{b+k}}\right). \tag{A.2} \]

This series expansion converges rapidly, and Benford behavior for $\zeta$ is equivalent to the rapidly converging
series in (A.2) equalling $b$ for all $b$.
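The series in (A.2) is easy to evaluate directly. The following sketch truncates the sum to $|k| \le K$ and measures how far $F_B(b)$ is from $b$ on a grid; the helper name `F_B`, the cutoff `K = 60`, and the grid size are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def F_B(b: float, B: float = 10.0, K: int = 60) -> float:
    """Truncation of (A.2): sum over |k| <= K of exp(-B**k) - exp(-B**(b+k))."""
    total = 0.0
    for k in range(-K, K + 1):
        # expm1 keeps precision when B**k is tiny (large negative k);
        # the two "-1" offsets cancel in the difference.
        total += math.expm1(-(B ** k)) - math.expm1(-(B ** (b + k)))
    return total


if __name__ == "__main__":
    grid = [i / 1000 for i in range(1001)]
    for B in (math.e, 10.0):
        deviation = max(abs(F_B(b, B) - b) for b in grid)
        # A Benford variable would give deviation 0; here it is small but positive.
        print(B, deviation)
```

The endpoint values $F_B(0) = 0$ and $F_B(1) = 1$ come out as expected, the latter because the sum telescopes.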

As Benford behavior is equivalent to $F_B(b)$ equalling $b$ for all $b \in [0,1]$, it is natural to compare $F_B'(b)$ to
1. If the derivative were identically 1 then $F_B(b)$ would equal $b$ plus some constant. However, (A.2) is zero
when $b = 0$, which implies that this constant would be zero. It is hard to analyze the infinite sum for $F_B(b)$
directly. By studying the derivative $F_B'(b)$ we find a function with an easier Fourier transform than the
Fourier transform of $e^{-B^{u}} - e^{-B^{b+u}}$, which we then analyze by applying Poisson Summation.

We use the fact that the derivative of the infinite sum $F_B(b)$ is the sum of the derivatives of the individual
summands. This is justified by the rapid decay of the summands; see, for example, Corollary 7.3 of [La]. We
find

\[ F_B'(b) = \sum_{k=-\infty}^{\infty} e^{-B^{b+k}} B^{b+k} \log B = \sum_{k=-\infty}^{\infty} e^{-\beta B^{k}} \beta B^{k} \log B, \tag{A.3} \]

where for $b \in [0,1]$ we set $\beta = B^b$.

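Let $H(t) = e^{-\beta B^t} \beta B^t \log B$; note $\beta \ge 1$. As $H(t)$ is of rapid decay in $t$, we may apply Poisson Summation
(see for example [SS]). Thus

\[ \sum_{k=-\infty}^{\infty} H(k) = \sum_{k=-\infty}^{\infty} \widehat{H}(k), \tag{A.4} \]

where $\widehat{H}$ is the Fourier transform of $H$: $\widehat{H}(u) = \int_{-\infty}^{\infty} H(t) e^{-2\pi i t u}\, dt$. Therefore

\[ F_B'(b) = \sum_{k=-\infty}^{\infty} H(k) = \sum_{k=-\infty}^{\infty} \widehat{H}(k) = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\beta B^t} \beta B^t \log B \cdot e^{-2\pi i t k}\, dt. \tag{A.5} \]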

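Let us change variables by taking $w = B^t$. Thus $dw = B^t \log B\, dt$ or $\frac{dw}{w} = \log B\, dt$. As
$e^{-2\pi i t k} = (B^{t/\log B})^{-2\pi i k} = w^{-2\pi i k/\log B}$, we have

\[ \begin{aligned}
F_B'(b) &= \sum_{k=-\infty}^{\infty} \int_0^{\infty} e^{-\beta w} \beta\, w^{-2\pi i k/\log B}\, dw \\
&= \sum_{k=-\infty}^{\infty} \beta^{2\pi i k/\log B} \int_0^{\infty} e^{-u} u^{-2\pi i k/\log B}\, du \\
&= \sum_{k=-\infty}^{\infty} \beta^{2\pi i k/\log B}\, \Gamma\left(1 - \frac{2\pi i k}{\log B}\right),
\end{aligned} \tag{A.6} \]

where we have used the definition of the $\Gamma$-function:

\[ \Gamma(s) = \int_0^{\infty} e^{-u} u^{s-1}\, du, \qquad \mathrm{Re}(s) > 0. \tag{A.7} \]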



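As $\Gamma(1) = 1$ we have

\[ F_B'(b) = 1 + \sum_{m=1}^{\infty} \left[ \beta^{2\pi i m/\log B}\, \Gamma\left(1 - \frac{2\pi i m}{\log B}\right) + \beta^{-2\pi i m/\log B}\, \Gamma\left(1 + \frac{2\pi i m}{\log B}\right) \right]. \tag{A.8} \]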

Remark A.1. The above series expansion is rapidly convergent, and shows the deviations of $\log_B \zeta \bmod 1$
from being equidistributed as an infinite sum of special values of a standard function. As $\beta = B^b$ we have
$\beta^{2\pi i m/\log B} = \cos(2\pi m b) + i \sin(2\pi m b)$, which gives a Fourier series expansion for $F_B'(b)$ with coefficients
arising from special values of the $\Gamma$-function.

We can improve (A.8) by using additional properties of the $\Gamma$-function. If $y \in \mathbb{R}$ then from (A.7) we have
$\Gamma(1 - iy) = \overline{\Gamma(1 + iy)}$ (where the bar denotes complex conjugation). Thus the $m$-th summand in (A.8) is the
sum of a number and its complex conjugate, which is simply twice the real part. We have formulas for the
absolute value of the $\Gamma$-function for large argument. We use (see (8.332) on page 946 of [GR]) that

\[ |\Gamma(1 + ix)|^2 = \frac{\pi x}{\sinh(\pi x)} = \frac{2\pi x}{e^{\pi x} - e^{-\pi x}}. \tag{A.9} \]
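Writing the summands in (A.8) as $2\,\mathrm{Re}\left[e^{-2\pi i m b}\, \Gamma\left(1 + \frac{2\pi i m}{\log B}\right)\right]$, (A.8) becomes

\[ F_B'(b) = 1 + 2 \sum_{m=1}^{M-1} \mathrm{Re}\left[e^{-2\pi i m b}\, \Gamma\left(1 + \frac{2\pi i m}{\log B}\right)\right] + 2 \sum_{m=M}^{\infty} \mathrm{Re}\left[e^{-2\pi i m b}\, \Gamma\left(1 + \frac{2\pi i m}{\log B}\right)\right]. \tag{A.10} \]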



The rest of the claims of Theorem 1.1 follow from simple estimation, algebra and trigonometry. □

With constants as in the theorem, if we take $M = 1$ and $B = e$ (resp., $B = 10$) the error is at most .00499
(resp., .378), while if $M = 2$ and $B = e$ (resp., $B = 10$) the error is at most $3.16 \cdot 10^{-7}$ (resp., .006). Thus
just one term is enough to get approximately five digits of accuracy base $e$, and two terms give three digits
of accuracy base 10! For many bases we have reduced the problem to evaluating $\mathrm{Re}\left[e^{-2\pi i b}\, \Gamma\left(1 + \frac{2\pi i}{\log B}\right)\right]$.

This example illustrates the power of Poisson Summation, taking a slowly convergent series expansion and 
replacing it with a rapidly converging one. 
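To get a feel for the sizes involved, the modulus formula (A.9) gives the amplitude of the $m = 1$ term of (A.10) without any complex arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it computes the amplitude of the first harmonic rather than the error bounds quoted above.

```python
import math


def gamma_modulus(x: float) -> float:
    """|Gamma(1 + i x)| from (A.9): |Gamma(1 + i x)|^2 = pi x / sinh(pi x)."""
    return math.sqrt(math.pi * x / math.sinh(math.pi * x))


def first_harmonic_amplitude(B: float) -> float:
    """Size of the m = 1 term of (A.10): 2 |Gamma(1 + 2 pi i / log B)|."""
    return 2.0 * gamma_modulus(2.0 * math.pi / math.log(B))


if __name__ == "__main__":
    for B in (math.e, 10.0):
        amp = first_harmonic_amplitude(B)
        # Integrating the m = 1 term in b divides its amplitude by 2*pi,
        # which gives the size of the oscillation of F_B(b) - b.
        print(B, amp, amp / (2.0 * math.pi))
```

In base 10 the integrated amplitude comes out near .018, in line with the amplitude quoted for Figure 1, while in base $e$ it is far smaller.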

Corollary A.2. Let $\zeta$ have the standard exponential distribution. There is no base $B > 1$ such that $\zeta$ is
Benford base $B$.

Proof. Consider the infinite series expansion in (1.5). As $e^{-2\pi i m b}$ is a sum of a cosine and a sine term, (1.5)
gives a rapidly convergent Fourier series expansion. If $\zeta$ were Benford base $B$, then $F_B'(b)$ must be identically
1; however, $\Gamma\left(1 + \frac{2\pi i m}{\log B}\right)$ is never zero for $m$ a positive integer because its modulus is non-zero (see (A.9)).
As there is a unique rapidly convergent Fourier series equal to 1 (namely, $g(b) = 1$; see for a proof), our
$F_B'(b)$ cannot identically equal 1. □

Appendix B. Analyzing $N^{1-\delta}$ intervals simultaneously

We show why in addition to $\epsilon > 0$ we also needed $\epsilon > 1/3 - \delta/2$ when we analyzed $N^{1-\delta}$ intervals
simultaneously in (3.2); we thank one of the referees for providing this detailed argument.



Let $Y_1, \dots, Y_N$ be i.i.d. random variables with $E[Y_i] = 0$, $\mathrm{Var}(Y_i) = \sigma^2$, $E[|Y_i|^3] < \infty$, and set
$S_N = (Y_1 + \cdots + Y_N)/\sqrt{N\sigma^2}$. Let $\Phi(x)$ denote the cumulative distribution function of the standard normal.
Using a (non-uniform) sharpening of the Berry-Esseen estimate (see, for example, [Pe]), we find that for some
constant $c > 0$

\[ |\mathrm{Prob}(S_N \le x) - \Phi(x)| \le \frac{c\, E[|Y_i|^3]}{\sigma^3 \sqrt{N}\, (1 + |x|)^3}, \qquad x \in \mathbb{R}, \quad N \ge 1. \tag{B.1} \]



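Taking $Y_i = w_i - N^{\delta-1}$, where $w_i$ is defined by (2.1), yields

\[ S_N = \frac{M_N - N^{\delta}}{\sqrt{N^{\delta}(1 - N^{\delta-1})}}, \qquad \sigma^2 = N^{\delta-1}(1 - N^{\delta-1}), \qquad E[|Y_i|^3] \le 2N^{\delta-1}. \tag{B.2} \]

Thus (B.1) becomes

\[ \left| \mathrm{Prob}\left( \frac{M_N - N^{\delta}}{\sqrt{N^{\delta}(1 - N^{\delta-1})}} \le x \right) - \Phi(x) \right| \le \frac{3c\, N^{-\delta/2}}{(1 + |x|)^3} \tag{B.3} \]

for all $N \ge N_0$ (for some $N_0$ sufficiently large, depending on $\delta$).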



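For each $N$, $k$ and $\epsilon$ consider the event

\[ A_{N,k,\epsilon} = \left\{ \frac{M_N - N^{\delta}}{\sqrt{N^{\delta}(1 - N^{\delta-1})}} \in [-N^{\epsilon}, N^{\epsilon}] \right\} \tag{B.4} \]

(with $M_N$ the count for the $k$-th bin). Then as $N \to \infty$ we have

\[ \mathrm{Prob}\left( \bigcap_{k=1}^{N^{1-\delta}} A_{N,k,\epsilon} \right) \to 1 \tag{B.5} \]

provided that

\[ \sum_{k=1}^{N^{1-\delta}} \mathrm{Prob}\left(A_{N,k,\epsilon}^{c}\right) \to 0 \tag{B.6} \]

as $N \to \infty$. Using (B.3) gives

\[ \begin{aligned}
\mathrm{Prob}\left(A_{N,k,\epsilon}^{c}\right) &\le \frac{6c\, N^{-\delta/2}}{(1 + N^{\epsilon})^3} + 2\left(1 - \Phi(N^{\epsilon})\right) \\
&\le 6c\, N^{-\delta/2 - 3\epsilon} + \sqrt{\frac{2}{\pi}}\, N^{-\epsilon} \exp\left(-N^{2\epsilon}/2\right)
\end{aligned} \tag{B.7} \]

(see, for example, [Fe]). Thus the sum in (B.6) is at most

\[ 6c\, N^{1 - 3\delta/2 - 3\epsilon} + \sqrt{\frac{2}{\pi}}\, N^{1-\delta-\epsilon} \exp\left(-N^{2\epsilon}/2\right), \tag{B.8} \]

and this is $o(1)$ provided that $\epsilon > 0$ and $\epsilon > 1/3 - \delta/2$.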